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Abstract — In this paper we establish the connection between 
the Orthogonal Optical Codes (OOC) and binary compressed 
sensing matrices. We also introduce deterministic bipolar m x n 
RIP fulfilling ±1 matrices of order k such that « llflilf l] • 
The columns of these matrices are binary BCH code vectors 
where the zeros are replaced by — 1. The samples obtained by 
these matrices can be easily converted to the original sparse 
signal; more precisely, for the noiseless samples, the simple 
Matching Pursuit technique, even with less than the common 
computational complexity, exactly reconstructs the sparse signal. 
In addition, we combine the binary and bipolar matrices to form 
ternary sensing matrices ({0, 1, —1} elements) that satisfy the RIP 
condition. 

Index Terms — Compressed Sensing, Deterministic Matrices, 
Restricted Isometry Property , BCH codes, Orthogonal Optical 
Codes (OOC). 



I. Introduction 

MINIMIZATION of the number of required samples 
for unique representation of sparse signals has been 
the subject of extensive research in the past few years. The 
field of compressed sensing which was first introduced in (TJ 
and further in Q, (5) deals with reliable (high probability) 
reconstruction of a n x 1 but fc-sparse vector x„ x i from its 
linear projections (y mx i) onto an m-dimensional (to <C n) 
space: y mx i = $mxnX n xi- The two main concerns in com- 
pressed sensing are 1) selecting the sampling matrix <& mX n 
and 2) reconstruction of x nx i from the measurements y mx i 
by exploiting the sparsity constraint. 

The sampling matrix is usually treated by random selection 
of the entries; among the well-known random matrices are 
i.i.d Gaussian [1] and Rademacher Q matrices. In general, 
the exact solution to the second concern, is shown to be an 
NP-complete problem |4); however, if the number of samples 
(to) exceeds the lower bound of to > 0(klog(n/k)), i\ 
minimization (Basis Pursuit) can be performed instead of 
the exact £q minimization (sparsity constraint) with the same 
solution for almost all the possible inputs |4), El- There are 
also other techniques such as greedy methods [5], |6| that can 
be used. 

In this paper we are interested in deterministic as opposed 
to random sampling matrices. Deterministic sampling matrices 
are useful because in practice, the sampler should finally 
choose a deterministic matrix; realizations of the random 
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matrices are not guaranteed to work. Moreover, by proper 
choice of the matrix, complexity or compression rate may be 
improved. In deterministic sampling matrix, we are looking 
for m X n matrices which obey the RIF0 of order fc. It is 
well-known that any k columns of a k x n Vandermond 
matrix are linearly independent; thus, if we normalize the 
columns, for all values of n, the new matrix satisfies the RIP 
condition of order k. In other words, arbitrary RIP-constrained 
matrices could be constructed in this way; however, when 
n increases, the constant Sk rapidly approaches 1 and some 
of the k x k submatrices become ill-conditioned [9| which 
makes the matrix impractical. In 1101 . p 2 x p r+1 matrices 
(p is a power of a prime integer) with 0, 1 elements (prior 
to normalization) are proposed which obey RIP of order k 
where kr < p. Another binary matrix construction with 
to = A;2 c '( loglogri ) measurements (E > 1) is investigated in 
ifTTI which employs hash functions and extractor graphs. The 
connection between coding theory and compressed sensing 
matrices is established in [12| where second order Reed- 
Muller codes are used to construct 2x2 2 matrices with 
±1 elements; unfortunately, the matrix does not satisfy RIP 
for all k-sparse vectors. Complex to 2 x to matrices with chirp- 
type columns are also conjectured to obey RIP of some order 
fl3l . Recently, almost bound-achieving matrices have been 
proposed in lfl4l which, rather than the exact RIP, satisfy 
statistical RIP (high probability that RIP holds). In this paper, 
in addition to establishing the connection between the binary 
sampling matrices and optical codes, we explicitly introduce 

(2 — 1) x 2 matrices with ±1 elements which obey 

the exact RIP for k < 2 J . The new construction is achieved by 
replacing the zeros of the linear binary block codes (specially 
BCH codes) by —1. In this approach, we require binary codes 
with minimum distances as large as almost half their code 
length; the existence of these codes will be shown by providing 
BCH codes. 

The rest of the paper is organized as follows: Binary 
sampling matrices are discussed in the next section. We will 
show that the OOC codes can be used for sampling matrix 
reconstruction. Furthermore, we show that the binary matrices 

'We say that the matrix A mxn obeys Restricted Isometry Property (RIP) 
|2| of order fc with constant < S k < 1 (RIC) if for all fc-sparse vectors 
x„ x i, we have: 

lAxll 2 

1 - S k < „ „, 2 < 1 + S k 



RIP is a sufficient condition for stable recovery. The basis pursuit and 
greedy methods can be applied for recovery of fc-sparse vectors from noisy 
samples with good results if the matrix A obeys RIP of order 2fc with a good 
enough constant &2k (U> GD- 
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cannot achieve the bound predicted by the theory of ran- 
dom measurements in compressed sensing. In section [HI] the 
construction of bipolar matrices using error correcting codes, 
especially BCH codes, is studied. We combine the binary and 
bipolar matrices in section[lV]to form ternary matrices and we 
present the recovery methods to obtain the sparse signal from 
the samples in section [V] Section [VTl represents the numerical 
simulations and finally, section [VITI concludes the paper. 



II. Binary Sampling Matrices 

In this section, we first introduce the means of producing 
RIP-fulfilling matrices. The approach is similar to iflOl and is 
not restricted to the binary matrices. 

Let A m xn be a real matrix with normalized columns such 
that the absolute value of the inner product of each two 
columns does not exceed A. Let B mX fc be any matrix com- 
posed of k different columns of A and define the Grammian 
matrix Gkxk = B T B. Since the columns of B are normal, 
the diagonal elements of G are all equal to 1; moreover, the 
absolute values of the non-diagonal elements of G do not 
exceed A. Therefore, we have: 

Vl<i<k: l&jl < A(/c- 1) (!) 

From Gershgorin circle theorem, we know that the eigen- 
values of G lie in the interval [1 — A(fc — 1) , 1 + A(fc — 1)]; 
thus, if k is small enough such that 6k = X(k — 1) < 1, the 
matrix A TOXIl satisfies RIP of order k with the constant 8k- 
In other words, in order to construct a sampling matrix for 
compressed sensing, we introduce to x n matrices for which 
we have A < -r^j. 

Binary sampling matrices are RIP-fulfilling matrices with 
0, 1 elements prior to column normalization. A subset of such 
matrices was previously studied in the field of Optical Code 
Division Multiple Access (OCDMA) with the name of OOC 
1 15 1; since in the optical communication only positive values 
can be transmitted, each user is assigned a binary vector 
(signature) with a fixed weight (number of l's) where the inner 
product of different vectors are small compared to the weight 
(in contrast to what OOC stands for, the signatures are not 
orthogonal). A useful upper bound (not necessarily achievable) 
for the maximum number of such binary vectors is given in 
1 16|: if R(m, w, A) stands for the maximum number of to x 1 
binary vectors with weight w such that the inner product of 
each two does not exceed A (A <E Z), we have: 



i?(m, w, A) < 



1 



A 



1 L l w-X j J 



(2) 



where [x\ represents the largest integer not greater than x. 
Although the small value of the inner product of the signatures 
is the main key for proper detection of the communicated 
message in a multi-access scenario, in asynchronous cases, 
the circular cross correlation (inner product of a signature 
with the circularly shifted versions of another) and autocorre- 
lation (inner product of a signature with its circularly shifted 
versions) are as important. Therefore, instead of a simple A, 
two parameters are involved: A a denotes the maximum value 
of the circular auto-correlation among all the code vectors 



when at least one and at most to — 1 units of shift are 
applied and A c denotes the maximum value of the circular 
cross-correlation among all the pairs. The OOC vectors are 
characterized by (to, w, X a , A c ); nonetheless, it is possible that 
the two correlation parameters are equal (A a — X c — A), in 
this case, the OOC is referred to as a (to, w, A)-code. 

Now let A be a set of OOC vectors of size m with weight w 
and the correlation parameters A a = A c = A; we also include 
all the possible circularly shifted versions of the codes in A. 
According to the definition of OOC's, the inner product of 
each pair in A is upper bounded by A. We construct the matrix 
A mX n by the normalized versions of the column vectors in 
A where n — \A\ (the order of the columns is unimportant). 
With respect to the upper bound on the inner product of the 
vectors in A, it is easy to verify that the matrix A satisfies 
RIP of order k < 1 + v- Below we will only discuss one of 
the OOC designs using Galois fields |17|: 

Let q = 16 a where a S N and let F = GF(q) with the 
primitive root a. It is clear that 5\q — 1 which confirms the 
existence of an integer d such that q = 5eJ + 1. Define: 



T>i = {a d+ \a 2d ^ 



'} , < i < d - 1 



(3) 



Since the number of l's are usually far less than that of 0's, 
it is common to represent the OOC vectors by their nonzero 
locations. For the above design, the length of the codes is 
equal to q — 1 (to = 16° — 1) and the nonzero locations of 
each code vector is given in {Ci} d ll: 



d = log a (Di - 1) , 1 < * < d- 1 



(4) 



Note that 1 £ T>q and Vq is not used for code construction. 
It is shown in ifTTl that the above method produces 
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OOC vectors with the characteristic (16 a — 1, 5, 2). Since in the 
construction of the sampling matrix for compressed sensing 
we include the circular shifts of the OOC vectors, we obtain a 
matrix with the size (16° — 1) x n where n J$ — 
for the RIP order of k = 1+ [|J = 3 and the constant S 3 = 1 — 
5-2*(fc-i) _ q g j n |Tpyj| employing the same approach, other 
(n,w,2) OOC vectors with larger w's (larger fc's in our case) 
are introduced which are claimed to be optimal considering 
the Johnson's inequality d2j. 

A matrix design independent of OOC codes is given in ifTOl 
that constructs p 2 x p r+1 binary matrices with the column 
weight of p (prior to normalization) such that the inner product 
of each two columns does not exceed r (~ after normalization). 
Here p is a power of a prime integer; the matrix construction 
is based on polynomials in GF(p). Although the introduced 
matrices do not achieve the bound predicted by the theory of 
random compressed sensing, using @, we show that these 
structures are asymptotically optimal when — > oo: 



r+l 
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> lim e ~ = e° = 1 (5) 

Besides, using (0, it can be shown that binary matrices are 
in general unable to reach the predicted compressed sensing 
bound unless w = 0(m). 

III. Bipolar Matrices via Linear Codes 

In this section, we will describe the connection between 
the sampling matrix and coding theorjH Since the parameters 
k,n are used in both compressed sensing and coding field, 
we distinguish the two by using the □ notation for coding 
parameters; e.g., h refers to the code length while n denotes 
the number of columns of the sampling matrix. 

Let C(n, k;2) be a linear binary block code and l r ~ ix i be 
the all 1 vector. We say C is 'symmetric' if l^xi G C. For 
symmetric codes, if a„ x i is a code vector, due to the linearity 
of the code, complement of a rax i which is defined as a nx i © 
1^x1 is also a valid code vector; therefore, code vectors consist 
of complement couples. 

Theorem 1: Let C(n, fc; 2) be a symmetric code with the 
minimum distance d m in and let A fix2 s_! be the matrix 
composed of code vectors as its columns such that from each 
complement couple, exactly one is selected. Define: 




Then, A satisfies the RIP with the constant 8^ = (k — l)(l — 
2 <kin) f or k < + 1 (k is the RIP order). 

Proof. First note that the columns of A are normal. In fact 
2A~ v9 fc_! — tJ-i is the same matrix as A where zeros 
are replaced by —1 (bipolar representation); hence, absolute 
value of each element of A is equal to which reveals that 

Vn 

the columns are normal. 

To prove the RIP, we use a similar approach to that of 1 10 1; 
we show that for each two columns of A, the absolute value 
of their inner product is less than "~ 2 ^'" i ' 1 . Let as x i,b^ xl 
be two different columns of A and a ?l x i , b rl x i be their 
corresponding columns in A. If a and b differ at I bits, we 
have: 

(a,b) = iflx(n-/) + (-l)xA =!L^ (7 ) 
n \ J n 

Moreover, b and a © lf ix i (complement of a) differ at h — I 
bits and since all the three vectors {a, a© lft X i, b} are 
different code words (from each complement couple, exactly 

2 At the time of submitting this paper, we realized that the same connection 
is recently established in "On Random Construction of a Bipolar Sensing 
Matrix with Compact Representation," IEEE Inf. Theo. Workshop, 2009 by 
Tadashi Wadayama. However, our work has been carried out independently 
and concurrently and we focus on the deterministic design of the codes rather 
than the probabilistic structure. 



one is chosen and thus b ^ affil^xi), both I and n — I should 
be greater than or equal to d m i n , i-e-,: 

^ dmin . 7 A 

~ j ^ j = ^ > &min _ t _ &min 

=> \n - 2l\ < n - 2d min (8) 

Note that 0,~ ix i, lf ix i G C and for each code vector a, either 
rf(0f ix i,a) or d(lfi X i,a) cannot exceed ^; therefore, n — 
2d m i n > 0. Combining (0 and ^ we have: 

|(a,b)|< "~ 2cU " (9) 
n 

which proves the claim on the inner product of the columns 
of A. This result together with Gershgorin circle theorem 
discussed in the previous section, proves the theorem ■ 

The above theorem is useful only when d m i n is close to 
^ (denominator for the upper bound of k), which is not the 
case for the common binary codes. In fact, in communication 
systems, parity bits are inserted to protect the main data 
payload, i.e., k bits of data are followed by n — k parity 
bits. In this case, we have d m in < n — k + 1; thus, to have 
d m in ~ f , the number of parity bits should have the same 
order as the data payload which is impractical. In the next 
section we show how these types of codes can be designed 
using the well-known BCH codefl 

A. BCH codes with large d m m 

Since the focus in this section is on the design of BCH 
codes with large minimum distances, we first briefly review 
the BCH structure. 

BCH codes are a class of cyclic binary codes with h = 
2 m — 1 which are produced by a generating polynomial g{x) £ 
GF(2)[x] such that g(x)|a; 2 " ' _1 + 1 ifTHl . According to a result 
in Galois theory, we know: 

x im ~ 1 + l= Yl (x-r) (10) 
r e GF(2" 1 ) 

Hence, the BCH generating polynomial can be decomposed 
into the product of linear factors in GF(2 m )[x}. Let a E 
GF(2 m ) be a primitive root of the field and let a 1 be one of the 
roots of g(x). Since g(x) E GF(2)[x], all conjugate elements 
of a 1 (with respect to GF(2)) are also roots of g(x). Again 
using the results in Galois theory, we know that these conju- 
gates are different elements of the set {ot l21 }™L^. Moreover, 
since a 2 '" -1 = 1, ix = ^(mod 2™ — 1) implies a n = a 12 
which reveals the circular behavior of the exponents. 

The main advantage of the BCH codes compared to other 
cyclic codes is their guaranteed lower bound on the minimum 
distance ifTSl : if a 11 , . . . , a 1 " 1 are different roots of g(x) (not 
necessarily all the roots) such that i%, . . . , id form an arithmetic 
progression, then d m in > d + 1. 

3 BCH codes are considered due to the existence of a deterministic lower 
bound on their minimum distance. Although there exists a similar bound for 
Reed-Muller codes, they can only produce sampling matrices with small RIP 
order (A;), due to their small minimum distance compared to the block size. 
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Now we get back to our code design approach. We construct 
the desired code generating polynomials by investigating their 
parity check polynomial which is defined as: 



h(x) 



1 



(11) 



In other words, each field element is the root of exactly one 
of the g{x) and h(x). We construct h(x) by introducing its 
roots. Let I < rh — 1 be an integer and define 



,(0 _ 



L +2'-l 



} 



(12) 



Note that the definition of depends on the choice of the 
primitive element (a). We further define 7x~} as the subset of 
Q)!/ which is closed with respect to the conjugate operation: 



n 



(l) A 



{reG% I VjeN: r v £ Q^} 



(13) 



The above definition shows that if r € t6~] then its conjugate 
r v £ Hfa- Now let us define h(x): 



h(x) = l[ (x - r) 



(14) 



reH), 



As discussed before, if r is a root of h(x), all its conjugates 
are also roots of h(x); therefore, h(x) £ GF(2)[x], which is 
a required condition. Moreover, 



1 = « G^ 



=> (l + x)\h{x) (15) 
which means that the all one vector is a valid code word: 
c=[l,...,lf 



c(x) = 1 + X + 



1 



h{x) 



x + 1 

c{x)h{x) (16) 



Hence, the code generated by g{x) — x h ^ is a symmetric 
code and fulfills the requirement of Theorem Q] For the 
minimum distance of the code, note that the roots of h(x) 
form a subset of Q { £; thus, all the elements in GF(2" l )\g<£> 
are roots of g(x): 

~ ~ X + 2' < j < 2™ - 2 : g(ai) = (17) 



V 2 r 



Consequently, there exists an arithmetic progression of length 
2m-i _ 2 1 _ i among the powers of a in roots of g(x). As a 
result: 



>(2 



77? — 1 



1) 



1 



>?n — 1 



(18) 



In coding, it is usual to look for a code with maximum d 
given h, k. Here, we have designed a code with good d 
for a given h but with unknown k: 



min 
min 



k + deg(g(x)) 
h - deg(g(x)) 
(deg(g{x)) +deg(h(x))) 

deg(h{x))=\U { 2\ 



deg(g(x)) 



(19) 



The following theorem reveals how | should be calcu- 
lated. 

Theorem 2: With the previous terminology, \n!~^\ is equal 
to the number of binary sequences of length m such that if 
the sequence is written around a circle, between each two l's, 
there exists at least rh — I — 1 zeros. 

Proof. We show that there exists a 1-1 mapping be- 
tween the elements of Ti^ and the binary sequences. Let 
(bm-i, ■ ■ ■ ,bo) £ {0,1}™ be one of the binary sequences 
and let (3 be the decimal number that its binary representation 
coincides with the sequence: 



/ 9 = (fc A _ 1 ...6 )a= J2 b > T 



(20) 



i=Q 



We will show that oP £ . For the sake of simplicity, let us 
define (3j as the decimal number that its binary representation 
is the same as the sequence ((3) subjected to j units of left 
circular shift (/3q = 13): 



Pi 

02 



(b 



'rh— 1 



50)2 



{bm-3 ■ ■ ■ bobm-lbm-2)2 



0m-l ~ (bobfh-1 ■ ■ ■ ^1)2 
Now we have: 



(21) 



2(3j — 2 X (prh-l-j ■ ■ ■ bobm-lbm-j)2 



— 2 m bm-l-j + {bfn-2-j ■ ■ ■ bobm-lbm-j0)2 

= j+ i (mod 2™ - 1) 

= Vf3 (mod 2™ - 1) 



(22) 



which shows that {a@ j }j are conjugates of ofi . To show oP £ 
Ti^, we should prove that all the conjugates belong to QW, 
or equivalently, we should show < 0j < 2 m ~ 1 + 2 l — 1. It 
is clear that < (3j\ to prove the right inequality we consider 
two cases: 

1) MSB of (3-j is zero: 

brh-i-j = fa < 2™- 1 < 2" 1 - 1 + 2 l -l (23) 

2) MSB of (3j is one; therefore, according to the property 
of the binary sequences, the following m — l — 1 bits are 
zero: 

Om—l-j = 1 =r- bm—2-j = ■ ■ ■ = h-j = 

l-l 



(3, < 2" 1 - 1 +J2 2j 

3=0 

fa < 2 



rh—X i 



i 



(24) 



Up to now, we have proved that each binary sequence with 
the above zero-spacing property can be assigned to a separate 
root of h(x). To complete the proof, we show that if the binary 
representation of (3 does not satisfy the property, then we have 
a 13 ^ . In fact, by circular shifts introduced in j3j, all the 
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bits can be placed in the MSB position; thus, if the binary 
representation of (3 does not obey the property, at least one 
of the (3j's should be greater than 2 m ~ 1 + 2 l — 1. This means 
that at least one of the conjugates of a 13 does not belong to 

gQ m 

Theorem prelates the code parameter k to a combinatorics 
problem. Using this relation, it is shown in Appendix lAl that 

|«W| = o((^ + l)^J. 

B. Matrix Construction 

Recalling the arguments in the previous section, the choice 
of the polynomial g(x) depends on the choice of the primitive 
root. In addition to this degree of freedom, from Theorem[T] no 
matter which code vectors from complement sets are selected, 
the generated matrix satisfies RIP. Hence, for a given primitive 
element, there are 2 2 (there are 2 fc_1 complement pairs) 
possible matrix constructions. Among these huge number of 
possibilities, some of them have better characteristics for 
signal recovery from the samples. More specifically, we look 
for the matrices such that columns are closed with respect to 
the circular shift operation: if a = [cti, ■ ■ . , a^] T is a column of 
A, for all 1 < j < n, a, = [aj, Oj+i, . . . , an, ai, . . . , aj-i] T 
is also a column of A. 

The key point is that the BCH codes are a subset of cyclic 
codes, i.e., if c^ x i is a code vector, all its circular shifts are 
also valid code vectors. Thus, if we are careful in selecting 
from the complement sets, the generated sampling matrix will 
also have the cyclic property. For this selection, it should be 
noted that if aft X i,bft X i is a complement pair and Cf ix i is 
a circular shifted version of a^xi, the overal parity (sum of 
the elements in mod 2) of a^ x i and bs x i are different (each 
code vector has 2 m — 1 elements which is an odd number) 
while a^xi and c^ x i have the same parity. Therefore, if we 
discard the code vectors with even (odd) parity (from the set 
of all code vectors), we are left with a set half the size of the 
main set such that from each complement set exactly one is 
selected while the set is still closed with respect to the circular 
shift operation. The selection algorithm is as follows: 

1) For a given k (compressed sensing parameter), let % — 
[log 2 (fc)l and choose m> i (the number of compressed 
samples will be m — 2 rn ~ 1). 

2) Let Tt seq be the set of all binary sequences of length rh 
such that l's are circularly spaced with at least i zeros. 
In addition, let Hdec be the set of decimal numbers such 
that their binary representation is a sequence in Tt seq . 

3) Choose a as one of the primitive roots of GF(2 m ) and 
define: 

H = {a r \ re H d ec} (25) 

4) Define the parity check and code generating polynomials 

as: 

h(x) =Y[{x-r) (26) 

and 

s( x ) = 4tt^ (27) 

n(x) 



m 


h(x) 


4 


x b + a; 4 + x A + 1 


6 


x' + x° + x A + 1 


8 


x ia + x vt + x iu + x a + x s + x i + x 6 + 1 


10 


x Ab + x & + X M + X 1\S + x Lb + j.14 + x i6 + X V2. 
+ X 10 + X 9 + X 7 + X h + X 4 + X 3 + X + 1 



TABLE I 

Parity check polynomials for different values of m when i = 3. 




5 6 7 8 ~ 9 10 11 12 13 



m 

Fig. 1. Degree of h(x) for different values of m and i. 

5) Let A( 2 ™_i) X (2ti=9('>)-i) be the binary matrix composed 
of even parity code vectors as its columns, i.e., if 
columns are considered as polynomial coefficients (in 
GF(2)[x]), each polynomial should be divisible by 
(x + l)g(x) (the additional factor of x + 1 implies the 
even parity). 

6) Replace all the zeros in A by —1 and normalize each 
column to obtain the final compressed sensing matrix 

(A( 2 Tf._ 1 ) x ( 2 <ies(fc)-l))- 

For a simple example, we consider the case m = i. It 
is easy to check that the number of l's in each of the 
binary sequences in step |2] cannot exceed one. Therefore, 
we have Hdec = {0, 2°, 2 1 , 2 2 , . . . , 2 2 '" 1 }. This means that 
h(x), except for the factor (x + 1) is the same as the 
minimal polynomial of a (the primitive root). Since for code 
generation, we use (x + l)g(x) instead of g(x), the effective 
h(x) will be the minimal polynomial of a which is a primitive 
polynomial. In this case, the matrix A is the (2 l — 1) x [2 l — 1) 
square matrix whose columns are circularly shifted versions 
of the Pseudo Noise Sequence (PNS) output generated by the 
primitive polynomial (the absolute value of the inner product 
of each two columns of A is exactly 2j _ x ). 

Table U summarizes some of the parity check polynomials 
for i — 3 (useful for k < 8). Also, Fig. Q] shows the degree of 
h(x) for some of the choices of rh and i; the increasing rate of 
the degree is linear in the beginning but becomes exponential 
after a point. 

IV. Matrices with {0, 1, -1} Elements 

We have presented a method to generate RIP-fulfilling 
matrices with ±1 elements. In this section, we show how 
binary and bipolar matrices can be combined to produce 
ternary matrices with larger sizes. 
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In order to explain the concept, we consider the p 2 x p r+1 
binary matrices (p is a prime power) in ifTOj where each 
column consists of p ones (prior to normalization). 

It is evident that by changing some of the l's in the afore- 
mentioned matrix into — 1, the norm of the columns does not 
change; however, the inner products change. To show how we 
can benefit from this feature, let us assume that p = 2 4 ; thus, 
there are 2 l nonzero elements in each column. We construct 
a new matrix from the original binary matrix as follows: we 
repeat each column 2 4 times and then change the sign of the 
nonzero elements in the replicas in such a way that these 
nonzero elements form a Walsh-Hadamard matrix. In other 
words, for each column, there are 2 l columns (including itself) 
that have the same pattern of nonzero elements. Moreover, the 
nonzero elements of these semi-replica vectors are different 
columns of the Walch-Hadamard matrix. Thus, the semi- 
replica vectors are orthogonal and the absolute value of the 
inner product of two vectors with different nonzero patterns is 
upper-bounded by r (maximum possible value in the original 
matrix). Hence, the new matrix still satisfies the RIP condition 
with the same fc and S^. 

Although we have expanded the matrix with this trick, 
the change is negligible when the order of matrix sizes is 
considered (p 2 x p r+1 is expanded to p 2 x p r+2 ). In fact, the 
orthogonality of the semi-replicas is not a necessary condition; 
we only need that their inner products do not exceed r in 
absolute value. It shows that instead of the Walch-Hadamard 
matrix, we can use other ±1 matrices with more number 
of columns (with the same number of rows) such that their 
columns are almost orthogonal (inner product less than r). 
This is the case for the matrices introduced in the previous 
sections. 

In order to mathematically describe the procedure, we need 
to define an operation. Let s be a (3 x 1 binary vector with ex- 
actly a elements of 1 in locations r%, . . . , r a 6 {1,2,..., (3}. 
Also, let Xq, x i = [x\, . . . , x a ] T be an arbitrary vector. We 
define y^ x i = A*( s > x ) as: 



V 1 < j < a 



Vvj Xj 

,r a } : yj =0 
From the above definition, we can see: 



(/i(s,xi) , ai(s,x 2 )) = (xi,x 2 ) 



(28) 



(29) 



Furthermore, if the elements of both xi , x 2 lie in the closed 
interval [—1,1], we have: 

|(M S l> X l) i M s 2,x 2 ))| < (si,s 2 ) (30) 

For the matrix construction, let m be an integer such that 
p = 2 m — 1 is a prime (the primes of this form are called 
Mersenne primes). Let fc < p be the required order of the RIP 
condition and let: 



r io g 2 fc i 



(31) 



Also let S 



p2 xp r+l 



si 



Spr+i] be the binary 



RIP-fulfilling matrix constructed as in IfTOI and X 



(fc 



px2 k 

H { Z~ l) \ with the previous terminology) 



[X! ... A 2 

be the ±1 matrix introduced in the previous sections (we 



further normalize the columns of these matrices). We construct 
a new p 2 x (p r+1 .2 k ) matrix with elements in {0, 1,-1} by 
combining these two matrices: 



A = [/i(si,xj) 



(32) 



Employing the same approach as used before, we show that A 
satisfies the RIP condition of order fc, i.e., we show that the 
inner product of two different columns of A cannot exceed 
t^-t in absolute value while each column is normal: 

( t*(si,Xj) , ) = = 1 (33) 

To study the inner product of //(s^x^) and /i(sj 2 ,Xj 2 ), 
we consider two cases: 

1) «x = i 2 . In this case, since = Sj 2 , we have: 

|( A*( s ii) x ii) ) M( S i2; X j2) )| = |( X jl ) X j2 )| 



< 



(34) 



2) i\ 7^ i 2 and therefore, ^ Si 2 ; since the elements of 
both Xjj and x^ lie in [—1, 1], we have: 



< 
< 



1 



(35) 



Inequalities (O and ([35]) hold due to the RIP-fulfilling 
structure of the matrices X and S. Hence, the claimed prop- 
erty of the inner products of the columns in A is proved. 
Consequently, A obeys the RIP condition of order fc. 

V. Reconstruction from the Samples 

Matching Pursuit is one of the simplest methods for the 
recovery of sparse signals from sampling matrices (linear 
projections). Here we show that this method can exactly 
recover the sparse signal from noiseless samples. 

Let A mx „ and s nx i be the sampling matrix and the k- 
sparse signal vector, respectively. The sampling process is 
defined by: 



< i — A f , 



' s nx 1 



(36) 



For unique reconstruction of s nx i from the samples y mx i> 
it is sufficient that the sampling matrix A mx „ satisfies RIP 
of order 2fc |8). In this section, we show that if A mx „ is be 
any of the mentioned matrices (including the ones in [ 10]) and 
satisfies RIP of order 2k, the matching pursuit method can be 
used for perfect reconstruction. In addition, if A mxn has the 
circular structure in its columns, the computational complexity 
can be decreased (less than the ordinary matching pursuit). 

Let {ii, . . . ,ifc} C {1, . . . , n} be the nonzero locations in 
s nx i; thus, we have: 

k 

y mx i = A • s = 22 s lj a, 3 . (37) 
i=i 

where sn denotes the i th column in A. In the matching pursuit 
method, in order to find the nonzero locations in s, the inner 
products of the sample-vector (y) with all the columns of A 
are evaluated and then, the index of the maximum value (in 



7 



absolute) is chosen as the most probable nonzero location. 
Here, we show that the index associated with the maximum 
value is always a nonzero location. Without loss of generality, 
assume Is,, I > IsiJ > 



> \si k \. We then have: 



= I ^ s ij ( a ij i a h)\ 

J'=l 

fc 

— l s iil( a U! a ii) — ^ l s ijll( a iji a ii/ 
fc 



> SiA- 



> Si, - 



2k 

fc- 1 , , fc 



411 2k-r 111 ~ 2fc-i 

Now assume 2 G {1, . . . , n}\{ii, . . . , «&}: 
k 



\s h \ (38) 



< 



< 



k 



1 



2fc 



fc 

3=1 



< 



2fc- 1' 



Combining (|38l ) and (1391 ). we get 

fc 



< 



I Sij < (y> a *i> 



(39) 



(40) 



2fc- 1 

Hence, the largest inner product is obtained either with 
or one of the other &i 's. Therefore, in the noiseless case, we 
never select a nonzero location by using the matching pursuit 
algorithm, and finally we reconstruct the original sparse signal 
perfectly. 

In each recursion of the matching pursuit algorithm, the 
inner product of y mx i with all the columns in A mxn needs 
to be calculated. Each inner product requires m multiplications 
and m— 1 additions. Now we observe that the circular property 
of the columns of A can be useful. Let a be one of the 
columns in A and ay) be its j th circularly shifted version. We 
observe that {a^}j are all columns of A; thus, (a^',y) has 
to be calculated for all j. Let {a^ 1 ' , a' 2 ' , . . . , a^) } be different 
elements of {a^}j (obviously /i < m and more precisely 
/i|m). These inner products require /im multiplications and 
/i(m — 1) additions if directly calculated. 

An alternative approach for evaluation of these values 
is to employ Discrete Fourier Transform (DFT) or its fast 
implementation -FFT The key point in this approach is that 
the inner products can be found through circular convolution 
of y and a, i.e., 



(41) 



where ® rn represents the circular convolution with period m. 
It is well-known that the circular convolution can be easily 
calculated using DFT: if y/ and &f denote the DFT of y and 
a, respectively, we have: 



IDFT{y f &a f } = [y ® m a| 



I m — 1-1 



(42) 



where v mX i u mx i = [ui«i, . . . , v m u m ] T . For evaluation 
of the inner products in this way, y/ has to be calculated 
only once using DFT. Thus, excluding the calculation of yj 
(which is done only once), the inner products of y with 
{a^}j require one DFT, one I DFT and m multiplications. 
Since [i different circular shifts of a are possible, at most 
/i coefficients of a/ at equi-distant positions are nonzero; 
hence, /i-point DFT (and consequently IDFT) of a mx i rather 
than the general m-point DFT is adequate. For /i-point DFT 
of y, we can simply down-sample the evaluated m x 1 
vector of y/ (note that /i|m) and there is no need for an 
extra /i-point DFT. Employing the FFT version, we require 
2/i[log 2 /i] multiplications and m — /i + /i[log 2 /i] additions 
per /i-point DFT or IDFT. Comparing the number of required 
multiplications in calculation of the above /i inner products 
reveals the efficiency of the DFT approach; i.e., the required 
computational complexity for reconstruction of the signal from 
the samples obtained from the sampling matrix is less than the 
common amount for general matrices. It should be emphasized 
that this reduction in computational complexity is the result 
of the circular format of the columns. 



VI. Numerical Results 

For simulation results we have investigated the bipolar 
and binary matrices which satisfy RIP of order fc = 3. For 
binary matrices, the Devore's structure in [10] with the size 
of 64 x 512 is considered (matrices using OOC codes yield 
similar results) while for bipolar matrices using BCH codes, 
the matrix size is 63 x 512. In addition, we have included a 
64 x 512 Gaussian random matrix to compare our results with 
that of the random compressed sensing. 

Figure |2] shows the SNR of the reconstructed signal when 
different sparsity orders are considered. The samples, prior to 
reconstruction, are subject to additive white Gaussian noise 
such that the samples are obtained with SNR = lOOdB. 
We have employed the Orthogonal Matching Pursuit (OMP) 
for the reconstruction and the results are averaged for 5000 
different input signals using MATLAB simulations. As shown 
in Fig. |2 both matrices perfectly recover the sparse signal for 
fc — 3, and their performance degrade when fc increases (the 
matrices are designed for fc = 3). An interesting observation is 
that the bipolar matrix outperforms the other two when larger 
fc's are considered; for example, at k — 20, the reconstructed 
signal using the samples obtained from the bipolar matrix is 
almost 17 dB better on average than that of the random matrix 
and 25dB better than that of the Devore's Matrix. Furthermore, 
random matrices perform better than the simulated binary 
matrix. 

In order to include the noise effect in our results, we have 
considered the reconstruction at different noise levels in Fig. 
[3] using the same matrices at k = 20 (an overloaded value for 
the designed matrices). Again the OMP method is employed 
for reconstruction and the results are averaged over 5000 runs. 
The results confirm that the performance curve is continuous 
(stability) when noise is included. Since the curves for all the 
three matrices coincide for fc = 3, we did not include it. 
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Fig. 2. The SNR of the reconstructed signal for different sparsity values (fc) 
when the compressed samples are accompanied with 100 dB additive noise. 
Sampling matrices for BCH and Devore methods satisfy RIP of order 3. 
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Fig. 3. The SNR of the reconstructed signal for 20-sparse signals when the 
compressed samples are accompanied with different noise powers. Sampling 
matrices for BCH and Devore methods satisfy RIP of order 3 



VII. Conclusion 

Despite the enormous amount of literature in random sam- 
pling matrices for compressed sensing, deterministic designs 
are not well researched. In this paper, we introduce a new 
connection between the OOC codes and RIP fulfilling matrices 
which results in construction of binary sampling matrices. We 
have further presented a design for bipolar matrices using 
linear binary correction codes, especially BHC codes. In the 
latter design, we replace the zeros in the binary linear code 
vectors by —1 and use them as the columns of the sampling 
matrix in compressed sensing. The advantage of these matri- 
ces, in addition to their deterministic and known structure, is 
the simplicity in the sampling process; real/complex entries in 
the sampling matrix increases the computational complexity 
of the sampler as well as the required bit-precision for storing 
the samples. The linear codes for this purpose should have 
some desired characteristics; existence of such linear codes is 
proved by explicitly introducing binary BCH codes. One of 
the features of these matrices is that their generated samples 
can be easily decoded (using matching pursuit method) as 
the original sparse vector and due to the circular structure 
of the columns, the computational complexity in recovery 
can be reduced. These ±1 matrices are further expanded by 
considering {0,1,-1} elements; this expansion is achieved 
by combining the bipolar and binary matrices. Although the 



generated matrices show an improvement in the realizable 
size of the RIP-constrained matrices, the bound predicted by 
random matrices cannot be achieved. 

Appendix 
Evaluation of k 

In Theorem [2] we showed that k is equal to the number 
of binary sequences of length m such that no two Is are 
spaced by less than fh — I — 1 zeros (circular definition). To 
evaluate this number, let us define as the number of binary 
sequences of length b such that if the sequence is put around 
a circle, between each two l's, there is at least a zeros. In 
addition, let be the number of binary sequences such l's 
are spaced by at least a zeros apart (circular property is no 



longer valid for re^). We first calculate k^' and then we show 
the connection between and . 

There are two kinds of binary sequences counted in : 

1) The last bit in the sequence is 0; by omitting this bit, 
we obtain a sequence of length 6—1 with the same 
property. Also, each binary sequence of length 6—1 
with the above property can be padded by while still 
satisfying the required property to be included in 
Therefore, there are binary sequence of this type. 

2) The last bit in the sequence is 1; this means that the 
last a + 1 bits of the sequence are 0, . . . , 0, 1. Similar 

a 

to the above case, each binary sequence of length b — 



(a) 



a - 
0,. 



1 counted in can be padded by the block 

. , 0, 1 to produce a sequence included in . Thus, 



there are K^Z a _ x binary sequences of this type. 
In summary, we have the following recursive equation: 



J a ) _ ,J a ) i ^( a ) 

K >- — K b-1 "+" K b-a 



"b - n b-l n b-a-l ( 43 ^ 

Since for b < a + 1, there can be at most one 1 in the binary 
sequence, we thus have: 



1< b< a+ 1 



1 



equivalent to Kq - 
of as follows 



(44) 
a + 2) is 

1. If we define the onesided Z-transform 



From d43l ). the last initial condition (k£ 



(a) 



<- w (*)=E 



K b z > 



b=0 



it is not hard to check that: 



- 1 



1 



1 



(45) 



(46) 



Therefore, the increasing rate k^' with respect to b (b ^s> 1) 
has the same order as j b where 7 is the largest (in absolute 
value) root of f(z) = z a+1 - z a - 1. Since /(l) • /(2) < 0, 
there is a real root in (1 , 2); let us denote this root by 7. 
In fact, 7 is the largest root of f(z) (we do not prove this; 
however, if f(z) has a larger root, the increasing rate of ft^ 
would be greater than 7 b ): 



1< 7 <2 , /( 7 ) = 7 Q 



' 7 



1 = 



(47) 
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Since 7 > 1 we have Y L+1 > 1. For the sake of simplicity, 
let us define: 



5 4 7 a +! - 1 

We thus have: 

(1 + 6)- (1 + 5)^ = 1 

=> (1+5)^^(1+5)^ - 1^ = 1 

a 

=* (l + 5)*((l + 5)-l)=£> + 5)^ 

j'=o 

a 

=> 5(1 + 5)^ >y i + _ 

v y ~ ^ a + 1 

5(1 + 5) > a + l + |<5 
/, a- 2,2 ^ (a- 2) 2 i /a + 4 

r a + 1 a + 3 

5 > — — =*> l+5>— — 



(48) 



7 > 



a + 3\ -+ 



(49) 



Now we can show the connection between rf^ and k^. 
According to the definition of these parameters, we see that 
every bin; 
therefore: 



every binary sequence counted in is also counted in , 



_.(<*) < „(<») 



(50) 



In addition, if a sequence counted in is padded with a 

zeros at the end, it satisfies the requirements to be counted in 

» 



r h w , thus: 



(a) < (a) 
K b-a - T b 



(51) 



Combining the latter two inequalities, we get: 

0(7 & " a ) < r b (a) < C( 7 fc ) (52) 

The above equation in conjunction with the result in d49l ), 
yields: 

-In 

(53) 



a + 3V + 



The interpretation of the above inequality for k is as follows: 
k = rt" " I)'" ! (54) 



Figure |4] shows the asymptotic behavior of k[ q ' at different 
a values when b increases. For comparison, Fig. |4] suggests 

(5) h (5) 

that K b w 1.66 x 1.285 while our approximation is c ' > 
C(1.26 b ). 
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Fig. 4. Exact values of k!°' for different values of a and b. 
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